Hierarchical clustering for multiple nominal data streams with evolving behaviour
Abstract
Over the past decade, a number of attempts have been made towards data stream clustering, but most works fall under the clustering-by-example approach. There are applications where a clustering-by-variable approach is required, which involves clustering multiple streams as opposed to clustering the examples in a single stream. Furthermore, few techniques have been presented for clustering multiple data streams, and these are applicable to numeric data only. This research gap has motivated the current work. In the present work, a hierarchical clustering technique is proposed to cluster multiple nominal data streams. To address concept changes in the streams, splitting and merging of clusters in the hierarchical structure are performed. The decision to split or merge is based on an entropy measure representing the cluster's degree of disparity. The performance of the proposed technique is analysed and compared with Agglomerative Nesting on synthetic as well as real-world data in terms of the Dunn Index, the Modified Hubert Γ statistic, the Cophenetic Correlation Coefficient, and Purity. The proposed technique outperforms Agglomerative Nesting on evolving data streams. The effect of concept evolution on the clustering structure and the average entropy is visualised for detailed analysis and understanding.
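The entropy-based split decision mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact entropy formula or thresholds, so everything here is an assumption for illustration: plain Shannon entropy over the symbols that the streams in a cluster emit at each time step, averaged over a window, with hypothetical names (`shannon_entropy`, `cluster_entropy`, `decide`) and a hypothetical `split_threshold`.

```python
from collections import Counter
from math import log2


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of nominal values."""
    total = len(values)
    if total == 0:
        return 0.0
    counts = Counter(values)
    # p * log2(1/p) summed over the distinct symbols
    return sum((c / total) * log2(total / c) for c in counts.values())


def cluster_entropy(streams):
    """Average per-time-step entropy across the streams of one cluster.

    `streams` is a list of equal-length sequences of nominal symbols.
    Low values mean the streams agree with each other (low disparity).
    This is an illustrative measure, not necessarily the paper's.
    """
    steps = list(zip(*streams))  # symbols seen at each time step
    if not steps:
        return 0.0
    return sum(shannon_entropy(step) for step in steps) / len(steps)


def decide(streams, split_threshold=0.8):
    """Hypothetical rule: split a cluster whose disparity is too high."""
    return "split" if cluster_entropy(streams) > split_threshold else "keep"


if __name__ == "__main__":
    agreeing = [["a", "b", "a"], ["a", "b", "a"]]
    diverging = [["a", "a", "a"], ["b", "b", "b"]]
    print(decide(agreeing), decide(diverging))
```

Under these assumptions, two streams that always emit the same symbol give a cluster entropy of zero, while two streams that never agree give one bit per step, which exceeds the illustrative threshold and triggers a split. A merge rule could be sketched the same way by testing whether the entropy of the union of two sibling clusters stays below a second threshold.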
Similar resources
A Framework for Clustering Evolving Data Streams
The clustering problem is a difficult problem for the data stream domain. This is because the large volumes of data arriving in a stream render most traditional algorithms too inefficient. In recent years, a few one-pass clustering algorithms have been developed for the data stream problem. Although such methods address the scalability issues of the clustering problem, they are generally blind...
Hierarchical Time-Series Clustering for Data Streams
This paper presents a time-series whole clustering system that incrementally constructs a hierarchy of clusters. The Online Divisive-Agglomerative Clustering (ODAC) system is an incremental implementation of divisive analysis clustering, using the correlation between time series as the similarity measure. The system tests existing clusters by descending order of diameters, looking for a possible bina...
Distributed Weighted Clustering of Evolving Sensor Data Streams with Noise
Collecting data from sensor nodes is the ultimate goal of Wireless Sensor Networks. This is performed by transmitting the sensed measurements to some data collecting station. In sensor nodes, radio communication is the dominating consumer of the energy resources which are usually limited. Summarizing the sensed data internally on sensor nodes and sending only the summaries will considerably sav...
HIERARCHICAL DATA CLUSTERING MODEL FOR ANALYZING PASSENGERS’ TRIP IN HIGHWAYS
One of the most important issues in urban planning is developing sustainable public transportation. The basic condition for this purpose is analyzing the current situation, especially on the basis of data. Data mining is a set of new techniques that go beyond statistical data analysis. Clustering techniques are a subset of it, and one of them is used here for analyzing passengers’ trips. The result of...
Clustering Based Active Learning for Evolving Data Streams
Data labeling is an expensive and time-consuming task. Choosing which labels to use is increasingly becoming important. In the active learning setting, a classifier is trained by asking for labels for only a small fraction of all instances. While many works exist that deal with this issue in non-streaming scenarios, few works exist in the data stream setting. In this paper we propose a new acti...
Journal
Journal title: Complex & Intelligent Systems
Year: 2022
ISSN: 2198-6053, 2199-4536
DOI: https://doi.org/10.1007/s40747-021-00634-0